@greenhat616

@greenhat616 greenhat616 commented Oct 27, 2025

Description

Close #1294. Add JSON Schema dump feature.

Currently blocked by crate-ci/git-conventional#88.

Motivation and Context

Add a global option, --dump-context-schema, to dump the JSON Schema of the Context for the current version.

How Has This Been Tested?

Screenshots / Logs (if applicable)

Types of Changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation (no code change)
  • Refactor (refactoring production code)
  • Other

Checklist:

  • My code follows the code style of this project.
  • I have updated the documentation accordingly.
  • I have formatted the code with rustfmt.
  • I checked the lints with clippy.
  • I have added tests to cover my changes.
  • All new and existing tests passed.

@welcome

welcome bot commented Oct 27, 2025

Thanks for opening this pull request! Please check out our contributing guidelines! ⛰️

@codecov-commenter

Codecov Report

❌ Patch coverage is 47.91667% with 25 lines in your changes missing coverage. Please review.
✅ Project coverage is 43.03%. Comparing base (85cc05d) to head (572e139).

Files with missing lines            Patch %   Lines
git-cliff-core/src/changelog.rs     60.00%    12 Missing ⚠️
git-cliff/src/lib.rs                0.00%     9 Missing ⚠️
git-cliff/src/main.rs               0.00%     3 Missing ⚠️
git-cliff-core/src/remote/mod.rs    75.00%    1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1296      +/-   ##
==========================================
- Coverage   43.46%   43.03%   -0.43%     
==========================================
  Files          22       22              
  Lines        1972     1992      +20     
==========================================
  Hits          857      857              
- Misses       1115     1135      +20     
Flag         Coverage Δ
unit-tests   43.03% <47.92%> (-0.43%) ⬇️

Contributor

@ognis1205 ognis1205 left a comment

First, please run the formatter with:

cargo +nightly fmt --all -- --verbose

and then run clippy with:

cargo clippy --tests --verbose -- -D warnings

to ensure the code passes both checks.

While using schemars to generate JSON Schemas in a code-first manner from the git-cliff-core and git-conventional models isn't necessarily a bad idea, both crates already expose their internal models publicly. Given that, adding schema-generation logic directly into the CLI might slightly drift away from the core purpose of git-cliff, which is primarily a changelog generator.

If you still prefer to support schema generation, you might consider isolating that functionality. For instance, by using remote deriving, you could maintain a small separate project that depends on git-cliff-core and git-conventional, and generate schemas there instead.

This would keep the CLI focused on its main responsibility while still enabling schema generation for external tooling.

@greenhat616
Author

I have done the fmt and clippy check.

Regarding the Schema → SDK issue: I agree this workflow is not ideal in most cases. Since Cliff does not define a stable Schema version, every change to Context can break existing interfaces, and with Python or TS there is no type checking at runtime. Therefore, if we use a Schema, we must pin the current Cliff version.

Regarding Schema generation, after careful consideration, I agree with defining a separate remote definition for git-conventional. However, the Release type inside git-cliff-core is quite complex; defining it in a third-party library would require significant effort to adapt after each release.

This PR is just a draft idea for now. Maybe a better approach: we could gate this with a feature flag and, via some process (scripts, an xtask, etc.), generate a Schema when Cliff publishes a release and upload it to the Releases artifacts. Would that be more appropriate?

@ognis1205
Contributor

cc: @orhun

Thanks for clarifying. I understand this PR is still a draft.
However, it seems that some tests are failing. Have you run the tests locally to confirm?

Also, I’m a bit hesitant about this feature itself, as well as adding a related option to the CLI. We should consider whether this functionality is truly needed for most users. Even if it is, it might be cleaner, from a separation-of-concerns perspective, to add a separate binary under git-cliff-core rather than integrating it into the main CLI.

> I agree this workflow is not ideal in most cases.

As you also noted, the current implementation direction seems problematic to me as well.

> defining it in a third-party library would require significant effort to adapt after each release.
> generate a Schema when Cliff publishes a release and upload it to the Releases artifacts.

In other words, what's being proposed here effectively shifts the maintenance burden to the maintainers, before we've even established a clear need for the feature. I think we should have a more thorough discussion about the necessity and maintenance cost of this functionality before proceeding further.

Personally, I would still recommend creating a separate repository dedicated to schema generation, which could depend on both git-cliff-core and git-conventional.

@orhun
Owner

orhun commented Oct 29, 2025

Hey, sorry for the delay on this. I'll be having a look soon hopefully

@orhun
Owner

orhun commented Nov 3, 2025

Although I like the changes in this PR, I agree that the output of this CLI command would be too specific to the current release of git-cliff. In other words, it's too dynamic and it might limit our capability to update the context without worrying about user setups that depend on this. We could maybe use a version key, but I believe it will be incremented very quickly if we add/remove/update fields.

I'm not sure what a separate repo would look like for this. Are there any example projects doing this that we could draw inspiration from?

Thoughts?

@ognis1205
Contributor

I don't know of any open-source projects that apply an implementation or operational approach like the one presented in this PR, but I believe there are internal tools that use JSON Schema to validate data type consistency.

I could try generating the schema using build.rs in a designated repository with the method I mentioned earlier. In particular, since the data models in git-cliff-core are expected to vary depending on feature flags (even though all backend feature flags are enabled by default in git-cliff), I think it would make more sense to manage the JSON Schemas in a separate dedicated repository, rather than deriving JsonSchema directly within git-cliff-core.

@orhun
Owner

orhun commented Nov 4, 2025

Yeah, I think that's a fair approach. Thanks for looking into this!

@ognis1205
Contributor

@greenhat616

By the way, I've re-read the related issue:

#1294

and I’m wondering — is git-cliff actually being used in the development cycle described there?

From my understanding, it doesn't seem like the issue's context involves using git-cliff directly. Could you clarify if I'm missing something?

@greenhat616
Author

@ognis1205

> By the way, I've re-read the related issue:
>
> #1294
>
> and I’m wondering — is git-cliff actually being used in the development cycle described there?
>
> From my understanding, it doesn't seem like the issue's context involves using git-cliff directly. Could you clarify if I'm missing something?

The limitation of git-cliff is that it cannot run scripts (Lua, JS, etc.) or query third-party APIs for more detailed information when generating a changelog. Compared with introducing a scripting engine for a custom preprocessing stage, adding type info to the existing Context, so that third-party tools can handle it more easily, would require fewer changes, wouldn’t it?

For example, if you’re using third-party planning/project-management tools like Linear or Feishu and need to map your journal content (preferably only PRs) to task IDs and titles in those tools, then using Context with scripting for custom processing is a great fit.

Sorry, the generator is an internal tool, so I cannot share it publicly.
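
As a rough illustration of the enrichment workflow described above (not the internal generator), the sketch below attaches task metadata to commits in a Context-like JSON document. Everything specific in it is an assumption: the sample data is a simplified, hypothetical stand-in for the real Context, `lookup_task` is a stub replacing a real Linear or Feishu API call, the `ENG-42` ID format is invented, and storing the result under an `extra` key is just one possible place to put it.

```python
import json
import re

# Assumed task-ID format such as "ENG-42"; adjust to the tracker in use.
TASK_ID = re.compile(r"\b[A-Z][A-Z0-9]+-\d+\b")

# Hypothetical data standing in for a real tracker API.
FAKE_TRACKER = {"ENG-42": "Support schema export"}


def lookup_task(task_id):
    """Return the task title, or None if the tracker does not know the ID."""
    return FAKE_TRACKER.get(task_id)


def enrich(releases):
    """Attach task info to each commit under an `extra` key, in place."""
    for release in releases:
        for commit in release.get("commits", []):
            match = TASK_ID.search(commit.get("message", ""))
            if match is None:
                continue  # No task ID in this commit message.
            task_id = match.group(0)
            extra = commit.get("extra") or {}
            extra["task"] = {"id": task_id, "title": lookup_task(task_id)}
            commit["extra"] = extra
    return releases


# Simplified, hypothetical Context: a list of releases with commits.
context = json.loads(
    '[{"version": "v1.0.0", "commits": ['
    '{"message": "feat: add export (ENG-42)"},'
    '{"message": "chore: bump deps"}]}]'
)
enriched = enrich(context)
print(json.dumps(enriched, indent=2))
```

A script like this only touches the fields it needs, which is exactly where generated types would help: a renamed or removed field would otherwise fail silently at runtime.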

> I don't know of any open-source projects that apply an implementation or operational approach like the one presented in this PR, but I believe there are internal tools that use JSON Schema to validate data type consistency.
>
> I could try generating the schema using build.rs in a designated repository with the method I mentioned earlier. In particular, since the data models in git-cliff-core are expected to vary depending on feature flags (even though all backend feature flags are enabled by default in git-cliff), I think it would make more sense to manage the JSON Schemas in a separate dedicated repository, rather than deriving JsonSchema directly within git-cliff-core.

Yes, I agree with this approach as well. This PR is mainly a prototype for the workflow described above. Publishing it in a separate repository, or bundling it with a Release artifact, would both be good options. External users could use this file to directly generate the Context for the current version. The schema metadata could be provided as a cliff-core feature rather than as a command flag in the cli.

> Although I like the changes in this PR, I agree that the output of this CLI command would be too specific to the current release of git-cliff. In other words, it's too dynamic and it might limit our capability to update the context without worrying about user setups that depend on this. We could maybe use a version key, but I believe it will be incremented very quickly if we add/remove/update fields.
>
> I'm not sure what a separate repo would look like for this. Are there any example projects doing this that we could draw inspiration from?
>
> Thoughts?

@orhun

Yes. Cliff is an actively evolving project, and the Context may change quite frequently. There’s no plan to provide stable type guarantees, and introducing a VERSION field to mark versions would be cumbersome. So treating this as something (a) exposed only behind a feature flag, (b) published alongside each release, or (c) maintained as a per-version schema in the repository are all acceptable trade-offs.

Asking users to define their own interfaces to adapt to the Context is too cumbersome and hard to validate—if the type definitions change substantially, it’s difficult for type checkers like tsc or pyright to automatically surface those changes. That would make upgrading cliff versions more painful.
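
To make the upgrade concern concrete, here is a minimal sketch of how two per-version schemas could be compared to surface added and removed fields. The schemas and field names below are hypothetical, and the walker only follows inline `properties` and `items`; it does not resolve `$ref`, which a generated schema would likely use.

```python
def property_paths(schema, prefix=""):
    """Collect dotted paths of properties declared inline in a JSON Schema.

    Limitation: `$ref` and combinators such as `anyOf` are not followed.
    """
    paths = set()
    for name, sub in schema.get("properties", {}).items():
        path = f"{prefix}{name}"
        paths.add(path)
        if isinstance(sub, dict):
            # Nested object properties.
            paths |= property_paths(sub, prefix=f"{path}.")
            # Properties of array items.
            items = sub.get("items")
            if isinstance(items, dict):
                paths |= property_paths(items, prefix=f"{path}[].")
    return paths


def diff_schemas(old, new):
    """Return the property paths removed from and added to `new`."""
    old_paths, new_paths = property_paths(old), property_paths(new)
    return {
        "removed": sorted(old_paths - new_paths),
        "added": sorted(new_paths - old_paths),
    }


# Hypothetical schemas for two versions; field names are illustrative only.
commit_v1 = {"type": "object", "properties": {"id": {}, "message": {}}}
commit_v2 = {"type": "object", "properties": {"id": {}, "summary": {}}}
schema_v1 = {
    "type": "object",
    "properties": {
        "version": {"type": "string"},
        "commits": {"type": "array", "items": commit_v1},
    },
}
schema_v2 = {
    "type": "object",
    "properties": {
        "version": {"type": "string"},
        "timestamp": {"type": "integer"},
        "commits": {"type": "array", "items": commit_v2},
    },
}
result = diff_schemas(schema_v1, schema_v2)
print(result)
```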

@ognis1205
Contributor

Thanks for the clarification @greenhat616 .

I still haven't fully grasped which specific field(s) of Context you intend to use in your use case, or where in the Context you’d like to add Linear-related information (such as task IDs or titles).

> query third-party APIs for more detailed information when generating a changelog.

I'm not familiar with Linear myself, but according to their docs, if you include the Linear issue ID in the PR title, Linear automatically links the PR to the issue. If that's the case, wouldn't it be possible to use this feature:

#1287

in git-cliff to convert issue IDs in PR titles into links to Linear issues during changelog generation?

Also, regarding schema validation: typically, schema or type validation is used at untrusted data boundaries. I'm not sure if git-cliff really falls into that category. To me, the motivation for providing a JSON Schema seems more like "avoiding the need to handcraft data models," rather than addressing data integrity or validation concerns.

If only certain fields of the context are problematic, it might be simpler to just treat the context as plain JSON and access or update the relevant fields via their paths — without deserializing the entire structure into a strict data model. After all, git-cliff doesn't seem to operate in an untrusted environment.
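
A minimal sketch of that path-based approach follows. The helper names `get_path` and `set_path` are made up for illustration, and the sample document is a simplified, hypothetical stand-in for the real Context.

```python
import json


def get_path(doc, path, default=None):
    """Follow a sequence of keys/indexes; return `default` if a step is missing."""
    current = doc
    for step in path:
        try:
            current = current[step]
        except (KeyError, IndexError, TypeError):
            return default
    return current


def set_path(doc, path, value):
    """Set the value at `path`; all parent containers must already exist."""
    *parents, last = path
    target = doc
    for step in parents:
        target = target[step]
    target[last] = value


# Simplified, hypothetical Context: a list of releases with commits.
context = json.loads(
    '[{"version": "v1.0.0", "commits": [{"message": "feat: add export"}]}]'
)

message = get_path(context, [0, "commits", 0, "message"])
missing = get_path(context, [0, "commits", 5, "message"], default="n/a")
set_path(context, [0, "commits", 0, "scope"], "cli")
print(message, missing, context[0]["commits"][0]["scope"])
```

The trade-off is that a missing path yields a default instead of a type error, so a field rename upstream goes unnoticed unless the script checks for it explicitly.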

That said, it might be a good idea to first make sure this PR passes the tests and that the JSON Schema is properly generated. At the moment, it's hard to verify anything without that.

Development

Successfully merging this pull request may close these issues.

Provide JSON Schema of Context to allow strict type generation towards scripts

4 participants